Bibliography

[221] Chaofan Tao, Lu Hou, Wei Zhang, Lifeng Shang, Xin Jiang, Qun Liu, Ping Luo, and Ngai Wong. Compression of generative pre-trained language models via quantization. arXiv preprint arXiv:2203.10705, 2022.

[222] Jiayi Tian, Chao Fang, Haonan Wang, and Zhongfeng Wang. Bebert: Efficient and robust binary ensemble bert. arXiv preprint arXiv:2210.15976, 2022.

[223] Naftali Tishby, Fernando C Pereira, and William Bialek. The information bottleneck method. arXiv preprint physics/0004057, 2000.

[224] Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, and Hervé Jégou. Training data-efficient image transformers & distillation through attention. In International conference on machine learning, pages 10347–10357. PMLR, 2021.

[225] VW-S Tseng, Sourav Bhattacharya, Javier Fernández-Marqués, Milad Alizadeh, Catherine Tong, and Nicholas D Lane. Deterministic binary filters for convolutional neural networks. International Joint Conferences on Artificial Intelligence Organization, 2018.

[226] Frederick Tung and Greg Mori. Similarity-preserving knowledge distillation. In Proc. of ICCV, pages 1365–1374, 2019.

[227] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.

[228] Diwen Wan, Fumin Shen, Li Liu, Fan Zhu, Jie Qin, Ling Shao, and Heng Tao Shen. Tbn: Convolutional neural network with ternary inputs and binary weights. In Proceedings of the European Conference on Computer Vision (ECCV), pages 315–332, 2018.

[229] Diwen Wan, Fumin Shen, Li Liu, Fan Zhu, Jie Qin, Ling Shao, and Heng Tao Shen. Tbn: Convolutional neural network with ternary inputs and binary weights. In Proceedings of the European Conference on Computer Vision, pages 315–332, 2018.

[230] Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R Bowman. Glue: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint arXiv:1804.07461, 2018.

[231] Guo-Hua Wang, Yifan Ge, and Jianxin Wu. Distilling knowledge by mimicking features. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

[232] Jingya Wang, Xiatian Zhu, Shaogang Gong, and Wei Li. Transferable joint attribute-identity deep learning for unsupervised person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2275–2284, 2018.

[233] Peisong Wang, Qinghao Hu, Yifan Zhang, Chunjie Zhang, Yang Liu, and Jian Cheng. Two-step quantization for low-bit neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4376–4384, 2018.

[234] Song Wang, Dongchun Ren, Li Chen, Wei Fan, Jun Sun, and Satoshi Naoi. On study of the binarized deep neural network for image classification. arXiv preprint arXiv:1602.07373, 2016.